Abstract This paper proposes an algorithm to guide a formation of mobile
robots, subject to communication constraints, from an arbitrary position to
the location of the source of a physical signal in a planar environment. The
information on the signal is based only on noisy measurements of its strength
collected during the mission, and the signal is considered to be weak and
indistinguishable from the noise in a large portion of the environment. The goal
of the team is thus to search for a reliable signal and finally converge to the
source location. An accurate estimation of the signal gradient is obtained by
fusing the data gathered by the robots while moving in a circular formation.
The algorithm proposed to steer the formation, called Gradient-biased
Correlated Random Walk (GCRW), exploits the gradient estimation to bias a
correlated random walk, which ensures an efficient non-oriented search motion
when far from the source. The resulting strategy is thus able to achieve a
suitable trade-off between exploration and exploitation. Results obtained in
simulated experiments, including comparisons with possible alternatives, are
presented to analyze and evaluate the performance of the proposed approach.

Keywords Multi-Robot Systems • Source Seeking • Robotic Sensor Networks 


1 Introduction 

Steering a team of autonomous mobile robots to the source of a physical
signal is a well-studied problem due to its numerous important applications.
These include environmental monitoring [12], search and rescue operations
[9], odor source detection [11], and pollution sensing [10]. In such scenarios,
the robots are typically able to sense the environment, collecting and
exchanging noisy measurements of the signal strength, and to exploit this
information to guide their motion, for instance through an estimation of the
signal gradient. However, in most of these applications the signal to locate
can be weak or decay quickly in space. As a result, in vast portions of the
search space the signal-to-noise ratio is such that a reliable estimation of
the gradient is impossible to obtain, and a non-oriented search strategy
becomes essential to ensure success. Moreover, in the presence of communication
limitations (such as a limited communication range), the robots are not
allowed to spread over the environment in an unconstrained way to speed up the
search process. Moving in a predefined formation is a well-studied solution [1]
for multi-robot teams to overcome this problem, and it has the additional
advantage that the known geometrical structure of the formation can be
exploited to improve the fusion of the collected information. During the task,
the formation then has to face the classical problem of exploration vs.
exploitation, where the right compromise between exploring new areas to improve
the current estimation and exploiting the poor available information needs to
be found. To solve this problem, deterministic switching strategies between
search and gradient descent approaches are too risky, since being
overconfident about an estimation can mislead the team, while an overly
conservative approach can be safer but inefficient in terms of convergence
time, which is a crucial factor in many real-world applications.

Fig. 1 A multi-robot formation has the goal of reaching the location of an
emitting source. Initially far from the source (green team), the team first
needs to explore the environment in search of a signal. When closer (blue
team), noisy measurements of the signal can be exploited to estimate the
gradient and guide the formation toward the source.

The main contribution of this work is to present a new strategy which 
instead includes both behaviors, i.e. gradient-based signal exploitation and 
non-oriented search, and is able to continuously pass from one to another via 
a probabilistic scheme. The cooperative gradient estimation is achieved by a 
weighted combination of robots’ measurements gathered while they are moving 
on a circular formation and is based on the results recently presented by one
of the authors in [5]. This configuration allows the team to obtain an accurate
gradient approximation and to estimate the relative error as a function of the 
formation radius. This approach is embedded in a search strategy based on 
correlated random walks, ensuring an efficient exploration of the environment 
in search for an exploitable signal. Finally, it is worth mentioning that the low 
computational burden of this approach makes it suitable for implementation 
on very light and constrained robotic platforms. 

The rest of the paper is organized as follows. Section 2 provides an overview
of work related to non-oriented search and source-seeking problems. The
formulation of the specific problem tackled in this paper is then presented in
Section 3. Section 4 briefly describes the exploration and gradient
estimation approaches which compose our algorithm, and Section 5 presents
how these strategies are combined to solve the source-seeking problem in the
proposed Gradient-biased Correlated Random Walk (GCRW) algorithm. In
Section 6 this algorithm is tested and validated in simulations, and the
results are provided and discussed. Section 7 concludes the paper with final
discussions and insights on possible future work.


2 Related Work 

The problem of using mobile sensors to locate an unknown target (source)
while minimizing the search time has been widely studied. In the absence of
a priori information on the target location, non-oriented strategies need to
be employed to explore the search space. In this scenario, stochastic motions
can outperform deterministic strategies and mix local search and global
exploration. The most basic of these motions, the simple random walk, presents
a diffusive behavior which tends to oversample the search space, thus achieving
slow, inefficient exploration. However, more complex processes such as Lévy
Walks (LW) and Correlated Random Walks (CRW) have been widely studied in this
context, both as optimal search strategies for autonomous agents and to
reproduce the movements of several animals in search of prey. A LW is defined
by a uniformly random selection of a new orientation and a step length $\ell$
which follows a power-law distribution $p_\ell(\ell) \sim \ell^{-\alpha}$, with
$1 < \alpha < 3$. For $\alpha > 3$ the movement eventually becomes Gaussian
distributed and so the motion is Brownian. Instead, $\alpha < 1$ does not
correspond to normalizable distributions. The case $1 < \alpha < 2$ is the
most interesting, where the motion grows ballistically as
$\langle x^2 \rangle \sim t^2$. In particular, Viswanathan et al. prove in [21]
that $\alpha \approx 2$ is the optimal value for a search in any dimension when
targets are uniformly distributed over the environment. A more comprehensive
review of results and insights involving the use of LWs in non-oriented
searches can be found in [19]. LWs have also been proposed for multi-robot
search problems, as in [17], where an artificial potential field is added to
create a repulsive force among robots to improve the dispersion process during
the search mission.
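
As an illustration of the Lévy walk definition above, the following Python sketch draws headings uniformly and step lengths from a power law by inverse-transform sampling. It is only a minimal sketch: the lower cut-off `l_min`, the function names and the default exponent of 2 are assumptions made for the example and do not come from the cited works.

```python
import math
import random


def levy_step_length(alpha, l_min=1.0, rng=random):
    """Draw a step length from p(l) ~ l**(-alpha) for l >= l_min."""
    if alpha <= 1.0:
        raise ValueError("alpha must exceed 1 for a normalizable distribution")
    u = rng.random()  # uniform in [0, 1)
    # Inverse-transform sampling: invert F(l) = 1 - (l / l_min)**(1 - alpha).
    return l_min * (1.0 - u) ** (-1.0 / (alpha - 1.0))


def levy_walk(n_steps, alpha=2.0, l_min=1.0, seed=0):
    """Return the list of (x, y) positions visited by a planar Levy walk."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(n_steps):
        heading = rng.uniform(-math.pi, math.pi)  # uniformly random orientation
        length = levy_step_length(alpha, l_min, rng)
        x += length * math.cos(heading)
        y += length * math.sin(heading)
        path.append((x, y))
    return path
```

The cut-off `l_min` is needed because a pure power law cannot be normalized down to zero length; smaller exponents produce more frequent long relocations.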

An alternative motion with similar properties is the CRW, where a
persistence in the motion orientation generates a correlation between successive
positions (CRWs are more extensively presented and rigorously defined in
Section 4.1). The relation between these two stochastic processes has been
analyzed in [15], and a study of the search efficiency in terms of motion
correlation is presented in [2].

The literature on source-seeking problems with mobile robots has grown
rapidly in recent years. Interesting and extensive reviews on source
localization approaches for environmental monitoring can be found in [7] and,
for more recent developments, in [3]. A vast family of algorithms is based on
the idea of Chemotaxis, where one or multiple robots follow the estimated local
gradient to locate the source. In [6], Chang et al. propose a multi-agent
approach to transform the detected turbulent plume field into a smoother scalar
field, conserving the same source but with a better defined gradient. All the
Chemotaxis-based approaches rely on the presence of a well-defined gradient
in the signal field and cannot manage a signal completely covered by noise. A
study of reactive single-robot algorithms which also deal with the search part
is presented in [16] for the problem of plume detection, where information is
still present but only in an intermittent and sparse way. Considering the same
scenario, i.e. where patches of detectable odors or chemical components are
diffused over the environment by turbulent flows, a gradient-free method called
Infotaxis was presented in [18]. In this algorithm, the search for the source is
guided by a maximization of the expected rate of information gain defined
through Shannon's entropy. In [8], an extension of this approach to multi-robot
systems has recently been proposed. Pasternak et al. presented in [13] a
different approach for the same problem but based on LWs to exploit the cues
detected in the flow. These strategies, known as Levytaxis, share with ours the
use of similar stochastic motions to guide the searcher but rely strongly
on the presence of a current or air flow which could guide the searcher toward
the source. In [13] for example, no measure of signal strength is used for the
strategy, which is instead based only on the flow structure. In our problem,
we do not assume the presence of any detectable current in the environment.

This paper aims to deal simultaneously with the two aforementioned
problems, proposing a new probabilistic scheme to mix non-oriented optimal
search with gradient-based source localization for a team of mobile robots.
This global strategy significantly extends the contribution presented in [5],
where only the gradient-based part is presented and highly noisy scenarios
could not be handled. Note that many optimization algorithms adopt a global
search strategy to find a coarse solution and/or escape local optima, mixed
with a local search to refine the solution and possibly converge to the global
optimum; these solutions, however, are not trivially adaptable to the
considered problem. In fact, even though the two domains are strictly related,
in most optimization techniques the way global properties are included does
not take into account the constraints of a physical system that has to explore
a real environment, such as the cost of traveling, the gathering of
measurements, the smoothness of the trajectory, etc., contrary to our case. For
these reasons, we believe the novelty in the contribution of this paper can be
identified not only with respect to the previous publication [5] but, more
broadly, with respect to the current literature on the subject.


3 Problem Formulation 

The objective of this work is to design a strategy to steer a team of cooperating 
robots with limited communication capabilities to the location of a signal 
source in a planar bounded environment. The robots do not have any prior 
knowledge on the source location and acquire information on-line through 
noisy measurements of the signal strength. 


3.1 Robots’ Formation 


In order to enhance the robots' capability to correctly estimate the signal
gradient as a team, we exploit their collaboration by deploying them in
uniformly distributed patterns along a circular formation. We consider here such
a configuration for two main reasons. Firstly, as discussed more thoroughly
in Section 4.2, fusing measurements collected by a group of sensors uniformly
distributed on a circle allows obtaining an accurate estimation of the gradient
at the center of the circle. Secondly, many commonly used underwater and
aerial robots are not able to stop at a given position (the source location in
our case) due to their physical constraints, for example if their linear velocity
cannot be reduced to zero. In these cases, a circular motion is a natural way
to encircle the source once located, while respecting the system constraints.

Let us consider a group of $N$ robots stabilized to a planar, uniformly
distributed circular formation described by a radius $R$, a rotation angle
$\psi_t^0 = \omega_0 t$, where $\omega_0$ is a constant rotation velocity, and
a given center point $c \in \mathbb{R}^2$. The position of robot $i$ at
instant $t$ is given by the following equation:

$$ r_t^i = c_t + R \begin{bmatrix} \cos\psi_t^i \\ \sin\psi_t^i \end{bmatrix}, \tag{1} $$

where

$$ \psi_t^i = \psi_t^0 + i\,\frac{2\pi}{N} \tag{2} $$

is the rotation angle.
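
The formation geometry of eqs. (1) and (2) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and argument order are hypothetical, and it assumes that the phase of robot i is the common rotation angle (rotation velocity times time) plus i times 2π/N.

```python
import math


def formation_positions(center, radius, n_robots, omega0, t):
    """Positions of n_robots evenly spaced on a rotating circle at time t."""
    psi0 = omega0 * t  # common rotation angle of the formation
    positions = []
    for i in range(1, n_robots + 1):
        psi_i = psi0 + i * 2.0 * math.pi / n_robots  # phase of robot i
        positions.append((center[0] + radius * math.cos(psi_i),
                          center[1] + radius * math.sin(psi_i)))
    return positions
```

For example, `formation_positions((0.0, 0.0), 5.0, 6, 0.1, 0.0)` returns six points on a circle of radius 5 whose offsets from the center sum to zero, which is the symmetry later used for the gradient estimation.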

There are several works dealing with the control of circular formations
for mobile robots. If the robots represent non-holonomic vehicles modeled by
unicycle-like dynamics, the cooperative circular control law presented in the
previous work [4] ensures the convergence of a team of robots to a circular
formation whose center is time-varying. Inspired by the synchronization problem
of coupled oscillators, the introduction of a potential function in the control
law makes the robots converge to an evenly spaced configuration, i.e. with the
rotation angle satisfying eq. (2) for $i = 1, \ldots, N$. In order to achieve
this uniform distribution along the circle, the cooperative control law relies
only on local information, specifically on the relative heading angles between
each robot and its neighbors. This local information can be directly measured by
the robots or transmitted via a communication network.

For the sake of clarity, we present here the main ideas of the time-varying
circular formation control proposed in [4]. The key idea is to generate
circular trajectories with a properly chosen, stable virtual system and to
make the robots track them. The autonomous virtual system is modeled by
unicycle dynamics with constant linear velocity as follows:

$$ \dot{\hat r}_t^i = |\omega_0| R \begin{bmatrix} \cos\hat\psi_t^i \\ \sin\hat\psi_t^i \end{bmatrix}, \qquad \dot{\hat\psi}_t^i = u_t^i, \tag{3} $$

where $\hat r_t^i$ is the position vector of the virtual agent $i$ at time $t$,
$\hat\psi_t^i$ its heading angle and $u_t^i$ the control input. Communication
between agents is considered in order to achieve a uniform distribution around
the desired circular formation. In the set-up adopted in this paper, each robot
communicates the virtual quantity $\hat\psi_t^i$ to its neighbors through an
undirected communication graph. A limited-range communication can be considered
in this case, since the robots only need to communicate with their two nearest
neighbors. Each virtual agent is stabilized to a fixed circular motion with
radius $R$ and angular velocity $\omega_0$ around the origin thanks to the
following circular control:


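$$ u_t^i = \omega_0 \left( 1 + \kappa\, (\dot{\hat r}_t^i)^T \hat r_t^i \right) - \frac{\partial U}{\partial \hat\psi_t^i}, \qquad \frac{\partial U}{\partial \hat\psi_t^i} = \frac{\kappa_u}{|\mathcal{N}_i|} \sum_{j \in \mathcal{N}_i} \sin\left( \hat\psi_t^j - \hat\psi_t^i \right), \tag{4} $$

where $\kappa > 0$ and $\kappa_u > 0$ are two control parameters, $\mathcal{N}_i$
is the set of neighbors of agent $i$ and $|\mathcal{N}_i|$ denotes its number of
neighbors. Using Lyapunov techniques and LaSalle's Invariance Principle, it can
be proven that the virtual system dynamics converge asymptotically to: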


$$ \dot{\hat r}_t^i = \omega_0\, \mathcal{R}_{\pi/2}\, \hat r_t^i, \tag{5} $$

which, with $\mathcal{R}_{\pi/2}$ denoting the rotation by $\pi/2$, corresponds
to a circular motion around the origin. Additionally, the gradient term forces
each virtual agent to move away from its neighbors until the equilibrium point,
corresponding to the evenly spaced configuration in the case of connected
graphs, is reached (details of the proof can be found in [4]). The circular
trajectories of the autonomous virtual system $\hat r_t^i$ are then considered
as references to be tracked by the relative vectors $r_t^i - c_t$. A simple
tracking controller can be designed to make the robots' dynamics converge to
the desired trajectory $\hat r_t^i + c_t$ and consequently to stabilize the
robots to an evenly spaced circular motion around the time-varying center
$c_t$, as described in (1).


3.2 Signal Strength

Each robot represents a mobile sensor or a vehicle equipped with a sensor that
is able to measure the strength of the signal emitted by the source. In
mathematical terms, the signal distribution emitted by the source is a
two-dimensional spatial function representing a scalar field with a maximum (or
minimum, depending on the nature of the signal) at the position where the
source is located. The distribution of the signal strength in the environment
is described by an unknown positive spatial mapping
$\sigma(r) : \mathbb{R}^2 \to \mathbb{R}^+$, so that robot $i$ measures the
signal strength at time $t$ at its position $r_t^i$ as $\sigma(r_t^i)$. We do
not consider here the case of multiple emitting sources in the environment,
i.e. the source located at $r^*$ is the only maximum of the scalar field.

In this work, we focus on scenarios where the signal strength and its decay
in space, compared with the dimension of the search area and the noise in the
measurements, are such that the signal can be considered zero and/or
indistinguishable from the noise in a large part of the environment. The model
of the signal is considered to be unknown and the robots are only able to
measure the signal strength at their current locations.

Let $\nabla\sigma(r) \in \mathbb{R}^2$ denote the gradient vector at $r$ and
$H_\sigma(r)$ the corresponding Hessian matrix. We assume that the signal
strength is smooth enough to consider that the Hessian matrix is bounded. This
assumption allows a large class of functions to represent the signal strength
of the scalar field of interest. For instance, several physical quantities,
such as electric and magnetic fields, light, radiation and sound, follow
inverse-square laws. Therefore, in these cases, the intensity of linear waves
radiating from a point source is inversely proportional to the square of the
distance from the source, satisfying the smoothness assumption. Diffusion
processes such as temperature and chemical concentration also satisfy this
assumption.


4 Source Localization 

The studied problem could ideally be separated into two different sub-tasks:
the signal search phase and the subsequent source estimation. In this scenario
the robots should switch strategy when the first objective is achieved.
However, the presence of noise often makes the knowledge of the signal strength
unreliable, and employing a single strategy able to optimally carry out both
phases and continuously pass from one to the other can be a more suitable and
robust solution. In particular, such a solution can guide the team during the
transition between the two phases, where a strong signal is not yet available
but weak noisy measurements can still help to bias the search strategy,
improving the results.

The proposed approach is based on CRWs biased by the estimation of the signal
gradient. We first describe the two different strategies; Section 5 then
presents the proposed algorithm, which combines both of them via a
probabilistic scheme without the need for a discontinuous transition.

Fig. 2 CRWs in a bounded environment with different values of correlation:
left $k = 1$, middle $k = 5$ and right $k = 50$. The blue stars indicate the
starting points, the green points the ending ones.


4.1 Exploration: Correlated Random Walk 

A CRW is a stochastic motion characterized by a short-term correlation in the
direction of movement. Such a correlation between successive step orientations,
also known as persistence, produces a local bias, and the walker is more likely
to maintain a direction similar to its previous one. This fundamental
statistical property is mathematically described by a unimodal probability
distribution of angles centered around the previous direction. Angular
distributions, defined on periodic intervals, are inherently different from
classic distributions and are usually constructed by wrapping the usual
definition on the real line, i.e.:

$$ f_{\mathrm{wrap}}(\theta) = \sum_{k=-\infty}^{+\infty} f(\theta + 2\pi k), \qquad \theta \in [-\pi, \pi). \tag{6} $$


However, closed-form expressions following this definition are possible in very
few cases. In this paper, we adopt the well-known von Mises distribution,
which is a close approximation of the wrapped normal distribution:

$$ \Phi(\theta) = \frac{1}{2\pi I_0(k)}\, e^{k \cos(\theta - \mu)}, \tag{7} $$

where $I_0(\cdot)$ is the zeroth-order modified Bessel function of the first
kind, $k$ regulates the process correlation ($1/k$ is the analogue of the
variance in the normal distribution) and the mean is, in our case, the previous
step orientation $\mu = \theta_{t-1}$.

In terms of movement steps, as for the simple random walk, their length is
constant at every time, and hereafter considered unitary without any loss of
generality. It is worth mentioning the difference with respect to the
aforementioned Lévy walks, where steps have a varying length, following a
given probability distribution, and the directions are uniformly distributed
in $[0, 2\pi)$. Additionally, if an external bias is also present, influencing
the determination of the new direction, the resulting process is usually known
as a Biased Correlated Random Walk (BCRW) [14].

The presence of motion inertia has crucial effects on the exploration
properties of CRWs. In fact, for limited time scales, shorter than a certain
correlation time $\tau$, CRWs can be considered super-diffusive processes, thus
significantly outperforming standard Brownian motions in terms of covering
capabilities. Then, at sufficiently large times a normal diffusion behavior
always emerges [20]. This phenomenon can be shown by considering the Mean
Square Displacement $\mathrm{MSD} = \langle (x - x_0)^2 \rangle$, which can be
considered a measure of the portion of space explored by a stochastic motion.
For long enough time $t$, the MSD of RWs scales as $t^\nu$, where $\nu = 1$ for
standard Brownian motion. On the other hand, in CRWs $\nu > 1$ for $t < \tau$
(and in particular $\nu = 2$ for continuous-time CRWs) [2]. An explicit
expression for $\tau$ as a function of the correlation is given by
$\tau \sim -1/\ln(\langle \cos\theta \rangle)$, where
$\langle \cos\theta \rangle = I_1(k)/I_0(k)$ for the von Mises distribution [2].

Fig. 2 shows an example of CRWs corresponding to increasing value of 
correlation k, namely for k = 1,5,50. It is clear how a higher persistence in 
the orientation reduces the path tortuosity and controls the trade-off between 
local and global exploration. 
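
To make the role of the correlation parameter concrete, the sketch below simulates a unit-step CRW whose heading is drawn from a von Mises distribution centered on the previous heading, and estimates the correlation time from the mean cosine of the turning angle. It is a minimal sketch under stated assumptions: the function names are hypothetical, the mean cosine is estimated by sampling rather than from the Bessel-function ratio, and the boundaries of the environment are ignored.

```python
import math
import random


def crw_path(n_steps, kappa, seed=0, start=(0.0, 0.0)):
    """Simulate a unit-step correlated random walk with von Mises turning."""
    rng = random.Random(seed)
    theta = rng.uniform(-math.pi, math.pi)  # arbitrary initial heading
    x, y = start
    path = [(x, y)]
    for _ in range(n_steps):
        # New heading: von Mises centered on the previous heading.
        theta = rng.vonmisesvariate(theta, kappa)
        x += math.cos(theta)
        y += math.sin(theta)
        path.append((x, y))
    return path


def mean_turn_cosine(kappa, n_samples=20000, seed=1):
    """Monte Carlo estimate of <cos(theta)> for a zero-mean von Mises law."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += math.cos(rng.vonmisesvariate(0.0, kappa))
    return total / n_samples


def correlation_time(kappa):
    """Correlation time estimate, tau ~ -1 / ln(<cos(theta)>)."""
    return -1.0 / math.log(mean_turn_cosine(kappa))
```

With this sketch, a larger concentration gives a mean cosine closer to one and therefore a longer correlation time, in line with the straighter paths obtained for higher correlation.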


4.2 Exploitation: Gradient Estimation 

Besides a non-oriented exploration strategy to search for a signal, to
efficiently solve our task we also need a reliable gradient estimation
technique to exploit the information when available. To tackle this problem we
rely on the results presented in [5]. In the presence of noisy signal
measurements, a circular formation has been proven to be a suitable
configuration to cooperatively estimate the gradient vector. This estimation is
then used to guide the center of the formation towards the source, allowing the
team to deploy around it and estimate its location with precision. During its
motion, each robot measures the signal strength at its current location and
transmits this information to the rest of the fleet in order to cooperatively
estimate the gradient. The following lemma formalizes the result on cooperative
estimation of the signal gradient at the center of a circular formation:

Lemma 1 Let $\sigma : \mathbb{R}^2 \to \mathbb{R}^+$ be a bounded function and
$\sigma(r^i)$ be the measurement collected by robot $i$, where $r^i$ is its
position vector given by (1). Considering a fleet of $N > 3$ robots uniformly
distributed along a circular formation centered at $c$ with radius $R$, the
following equation for the estimated gradient $\widehat{\nabla}\sigma(c)$
holds:

$$ \widehat{\nabla}\sigma(c) = \frac{2}{R^2 N} \sum_{i=1}^{N} \sigma(r^i)(r^i - c) = \nabla\sigma(c) + \varphi(R, c), \tag{8} $$

where the approximation error term $\varphi(R, c)$ satisfies

$$ \|\varphi(R, c)\| \le \lambda_{\max}(H_\sigma(r))\, R, $$

with $\lambda_{\max}$ representing the largest eigenvalue of the Hessian matrix.

The details of the proof, based on a Taylor series expansion, can be found in
[5]. It is important to highlight that this result holds only for circular
formations. The symmetry properties of an evenly spaced circular formation of
robots, in particular the fact that the robots' positions satisfy
$\sum_{i=1}^{N} (r^i - c) = 0$, are crucial to prove the previous lemma. Note
that in this previous work it is also proven that for quadratic signals, which
represent a good approximation of the true signal near the source, the
computation of the gradient is exact, i.e., eq. (8) is satisfied with
$\varphi(R, c) = 0$.

As a result, as proven by Lemma 1, the interest of using a circular formation 
is to exploit its symmetric properties in a way that the gradient estimation 
can be obtained via simple averages that involve only the products of the 
measurements collected by the robots and their relative position with respect 
to the formation center. Note that the smoothness assumption on the signal, 
i.e., that the Hessian matrix is bounded, is only required in order to obtain a 
bound on the approximation error. 
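
The averaging formula of Lemma 1 can be sketched as follows. This is an illustrative sketch rather than the implementation of [5]: the function name, the `signal` callable and the `phase` argument are hypothetical, and the scaling factor 2/(R²N) is the one that makes the weighted sum of measurements times relative positions recover the gradient for evenly spaced robots.

```python
import math


def estimate_gradient(signal, center, radius, n_robots, phase=0.0):
    """Estimate the gradient of `signal` at `center` from circle samples.

    `signal(x, y)` returns the (possibly noisy) measured strength at (x, y).
    """
    cx, cy = center
    gx = gy = 0.0
    for i in range(n_robots):
        psi = phase + i * 2.0 * math.pi / n_robots
        dx = radius * math.cos(psi)  # relative position r^i - c
        dy = radius * math.sin(psi)
        s = signal(cx + dx, cy + dy)
        gx += s * dx
        gy += s * dy
    scale = 2.0 / (radius ** 2 * n_robots)
    return (scale * gx, scale * gy)
```

For a quadratic field such as `10 - (x - 3)**2 - 2 * (y + 1)**2`, six robots on a circle of radius 5 centered at (1, 2) recover the true gradient (4, -12) up to rounding, which illustrates the exactness property stated for quadratic signals.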

We consider that the signal measurements collected by the robots are
corrupted by white zero-mean Gaussian noise, i.e., robot $i$ measures
$\sigma(r^i) + w^i$, where $w^i \sim \mathcal{N}(0, \varsigma^2)$.
Both the expectation and the variance of the additional estimation error due
to the noise are studied in [5]. It has been proven that the expectation is not
affected by Gaussian noise, i.e.

$$ \mathbb{E}\left[ \frac{2}{R^2 N} \sum_{i=1}^{N} \left( \sigma(r^i) + w^i \right)(r^i - c) \right] = \widehat{\nabla}\sigma(c), \tag{9} $$

and the variance of the error induced by the noisy measurements is inversely
proportional to the radius squared, i.e.

$$ \mathrm{Var}\left[ \frac{2}{R^2 N} \sum_{i=1}^{N} w^i (r^i - c) \right] = \frac{2\varsigma^2}{R^2 N}. \tag{10} $$

Therefore, the greater the radius, the smaller the influence of noise on the
gradient estimation. However, as stated in Lemma 1, the error term
$\varphi(R, c)$ vanishes when the radius tends to zero. Consequently, the
radius plays an important role both in the precision of the gradient estimation
and in the attenuation of the noise effects.
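
The dependence of the noise-induced error on the radius can be checked numerically. The sketch below is a Monte Carlo illustration under stated assumptions: the noise is taken as independent zero-mean Gaussian with standard deviation `noise_std` on each robot, the function name is hypothetical, and the closed-form value 2·noise_std²/(R²N) used for comparison follows from a direct computation for evenly spaced robots rather than being quoted from [5].

```python
import math
import random


def noise_induced_variance(radius, n_robots, noise_std, n_trials=20000, seed=0):
    """Sample variance of the x-component of the noise-only gradient error."""
    rng = random.Random(seed)
    offsets_x = [radius * math.cos(i * 2.0 * math.pi / n_robots)
                 for i in range(n_robots)]
    scale = 2.0 / (radius ** 2 * n_robots)
    samples = []
    for _ in range(n_trials):
        # Error contributed by pure measurement noise on each robot.
        err_x = scale * sum(rng.gauss(0.0, noise_std) * dx for dx in offsets_x)
        samples.append(err_x)
    mean = sum(samples) / n_trials
    return sum((e - mean) ** 2 for e in samples) / (n_trials - 1)
```

Doubling the radius should divide the returned variance by roughly four, while the approximation error bound of Lemma 1 grows linearly with the radius, which is the trade-off discussed above.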

The signal gradient estimated by the circular formation of robots provides
useful information to be exploited by our proposed searching strategy. The
direction and norm of the estimated gradient can be applied as an external bias
of a CRW-based algorithm. Based on the previously presented results on gradient
estimation, we are now able to compute the direction of the estimated gradient
$\widehat{\nabla}\sigma(c)$ at each time step as

$$ \theta_t^G = \arctan\left( \frac{e_2^T\, \widehat{\nabla}\sigma(c)}{e_1^T\, \widehat{\nabla}\sigma(c)} \right), \tag{11} $$

where $e_1 = [1, 0]^T$ and $e_2 = [0, 1]^T$ are basis vectors of the global
inertial coordinate frame.

Fig. 3 Activation function in $\alpha$ with $\sigma_0 = 5$ and $\gamma = 10, 5, 3$.


5 GCRW Algorithm 

The main idea behind the proposed algorithm is to design a probabilistic
scheme, which allows a continuous transition between the two previously
presented approaches for exploration and exploitation of the detected signal,
to steer the center of the circular formation. This transition needs to be
regulated by the signal-to-noise ratio, which represents an indication of how
reliable the collected information is. To quantify this concept of reliability,
we introduce the following function $\alpha_t$:

$$ \alpha_t(\bar\sigma_t) = \frac{1}{1 + e^{-\gamma(\bar\sigma_t - \sigma_0)}} \cos^2\left( \frac{\theta_t^G - \theta_{t-1}^G}{2} \right), \tag{12} $$

where $\bar\sigma_t$ is the mean signal measured by the robots, i.e.

$$ \bar\sigma_t = \frac{1}{N} \sum_{i=1}^{N} \sigma(r_t^i). \tag{13} $$

The first term in $\alpha_t$ is a classic choice for an activation function,
where $\gamma$ and $\sigma_0$ are user-defined parameters defining the
sigmoid's steepness and midpoint respectively (see Fig. 3). The second term
tends to place more trust in gradient estimations which are consistent in two
consecutive iterations. In other words, it is maximum when the two estimated
gradients are collinear and tends to zero when the angular difference tends to
$\pi$.

The function $\alpha_t$ is then used to construct a new probability
distribution for the random walk orientation, where the estimation of the
gradient is combined with the von Mises distribution, playing the role of a
bias. The final probability distribution is thus defined as:

$$ P_t(\theta) = (1 - \alpha_t)\,\Phi_t(\theta) + \alpha_t\,\delta(\theta - \theta_t^G), \tag{14} $$

where $\Phi_t(\cdot)$ is the von Mises distribution defined in eq. (7),
$\delta(\cdot)$ is the Dirac delta distribution and $\theta_t^G$ is the
estimated gradient direction defined in eq. (11). In the limit cases of a very
weak or very strong signal, the effects of the gradient estimation and of the
random motion, respectively, become irrelevant. However, in the more
interesting intermediate zone, the probabilistic selection of the strategy
weighted by the function $\alpha_t$ produces a smooth behavior and a more
robust response to random fluctuations generated by the noisy sensors.

In an analogous way, the distance traveled by the formation center at each
iteration is given by:

$$ \Delta_t = (1 - \alpha_t) + \alpha_t\, v\, \|\widehat{\nabla}\sigma(c)\|, \tag{15} $$

where $v$ is the classic step size of the gradient descent algorithm and
$\|\cdot\|$ the Euclidean norm. Additionally, a maximum speed constraint is
considered, i.e. $\Delta_t \le \Delta_{\max}$. The expression in (15) ensures a
unitary movement in the exploration phase and a step depending on the gradient
in the opposite limit case to guarantee convergence onto the source.
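
One iteration of the scheme described in this section can be sketched as follows. The sketch makes several assumptions that are not stated in the text: the mixture of eq. (14) is sampled by first drawing a Bernoulli variable with probability equal to the reliability value, the gradient direction is computed with `atan2` to resolve the quadrant, the consistency term uses the squared cosine of half the angular difference, and all function and parameter names are hypothetical.

```python
import math
import random


def _sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def reliability(mean_signal, theta_g, theta_g_prev, gamma, sigma0):
    """Reliability weight: sigmoid of the mean signal times a consistency term."""
    consistency = math.cos(0.5 * (theta_g - theta_g_prev)) ** 2
    return _sigmoid(gamma * (mean_signal - sigma0)) * consistency


def gcrw_step(theta_prev, grad_est, theta_g_prev, mean_signal, *,
              kappa, gamma, sigma0, step_gain, max_step, rng):
    """Return (heading, step length, reliability, gradient direction)."""
    gx, gy = grad_est
    theta_g = math.atan2(gy, gx)  # direction of the estimated gradient
    alpha = reliability(mean_signal, theta_g, theta_g_prev, gamma, sigma0)
    if rng.random() < alpha:
        theta = theta_g  # exploitation: follow the estimated gradient
    else:
        theta = rng.vonmisesvariate(theta_prev, kappa)  # exploration: CRW
    step = (1.0 - alpha) + alpha * step_gain * math.hypot(gx, gy)
    return theta, min(step, max_step), alpha, theta_g
```

With a mean signal far below the midpoint the reliability is close to zero and the call reduces to a unit-step CRW, whereas a strong and consistent signal makes it behave like a capped gradient step.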


6 Simulation Results 

In this section, we test the GCRW algorithm in simulations where the robots'
formation has to locate a signal source in a bounded environment. Different
signal models are considered to show the efficiency of the proposed approach.
Particular attention is given to the study of the effect of the correlation in
the random walk process and the robustness of the algorithm with respect to
changes in its parameters. The performance is evaluated in terms of the success
rate in locating the source given a limited time budget and the average time to
complete the mission. In all simulations, the robots move in a bounded
environment, with a maximum speed of 1 m/s, where a source of unknown position
emits a signal whose strength prevails over the noise only in a limited
region. The team is composed of six robots moving in a circular formation of
radius $R = 5$ m.¹ The noise on the measurements is modeled as zero-mean
Gaussian noise, whose variance is specified in each case.


6.1 Illustrative examples 

Fig. 4 presents a first illustrative example showing the importance of adding
an exploration strategy to the gradient-based source seeking approach in such
a scenario. In this case, the scalar field representing the signal, which has
non-convex level curves and its source at $(0, 0)$, is given by:

$$ \sigma(r) = a \left[ \exp\left( -r^T S_1 r \right) + \exp\left( -r^T \Theta^T S_2 \Theta\, r \right) \right], \tag{16} $$

with $\Theta(\cdot)$ representing the rotation matrix. At the starting position
the signal is too weak compared to the noise (here having a standard deviation
$\varsigma = 0.5$) and the team is not able to obtain a consistent gradient
estimation to guide its search. The resulting trajectory is thus only a random
movement close to the initial position and the robots fail to locate the
source. On the other hand, with the GCRW algorithm, the initial signal strength
is such that the exploration component initially prevails, and the team is
finally able to find the signal and be progressively led by a reliable gradient
estimation, converging onto the source.

¹ For an exhaustive analysis of the source seeking algorithm varying the number
of robots and formation radius see [5].

Fig. 4 The robots move in a bounded square region of side 250 m with a maximum
speed of 1 m/s. Top: at the team's initial position the noise prevails over the
signal, and with only the gradient estimation the robots fail to find the
source. Bottom: the GCRW algorithm allows the team to start exploring the
environment until a reliable signal leads it to correctly locate the source.

Fig. 5 presents a scenario with a more complex signal field, including
multiple sources (maxima). As expected, the gradient-based nature of the
algorithm ensures convergence to a local maximum, so the result depends on the
initial configuration. However, it is worth noting that the stochastic
Fig. 5 More complex signal presenting multiple sources. Due to the stochastic nature of
the exploration phase, the team can converge to a different source even starting from the 
same initial conditions. 



Fig. 6 Rate of failure in locating the source given a maximum mission time of 3000 s as a
function of the initial distance from the source for four different values of the random walk 
correlation k. 


exploration of the non-informative region can lead the formation to converge 
to different maxima even when starting from the same initial conditions. 


6.2 Algorithm Evaluation 

To give more quantitative information on the importance of correlation in the
random walk process used for the search phase, we carried out the following
test. Given a maximum time budget to achieve the mission, we compare the rate
of failures in encountering the source as a function of the initial distance
of the formation center from the source, for different values of the parameter
k, which regulates the correlation as shown in Fig. 2. Considering a maximum
mission time is fundamental in many real applications, where the limited
energy budget of the robots can represent a hard constraint and/or a fast
localization of the source can be crucial. The results are presented in Fig. 6
and correspond to the average obtained over 500 trials, for a maximum time
budget of 3000 s. The parameters in α(σ) are fixed as σ_0 = 3 and γ = 10. From
this result, it is clear that, starting from a distance greater than 75 m, the
exploration becomes crucial, and low correlation values lead to poor coverage
capability and, consequently, poor search efficiency.
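
The kind of test described above can be mimicked with a toy Monte Carlo
experiment. The sketch below is not the simulator used for Fig. 6; it makes
several assumptions that are not taken from the paper: the turning angle is
drawn from a von Mises distribution whose concentration kappa stands in for
the correlation parameter, the environment is a 250 m square centered on the
source, a trial succeeds as soon as the walker enters a hypothetical
detection radius, and hitting a wall reverses the heading.

```python
import math
import random

def failure_rate(kappa, start_dist, trials=20, budget_s=3000, speed=1.0,
                 half_side=125.0, detect_radius=40.0, seed=0):
    """Fraction of trials in which a correlated random walk misses the source."""
    rng = random.Random(seed)
    failures = 0
    for _ in range(trials):
        x, y = start_dist, 0.0                 # source at the origin
        heading = rng.uniform(0.0, 2.0 * math.pi)
        found = False
        for _ in range(budget_s):              # one step per simulated second
            # kappa = 0 gives an uncorrelated walk; larger kappa keeps the heading
            heading += rng.vonmisesvariate(0.0, kappa)
            nx = x + speed * math.cos(heading)
            ny = y + speed * math.sin(heading)
            if abs(nx) > half_side or abs(ny) > half_side:
                heading += math.pi             # bounce off the boundary
                continue
            x, y = nx, ny
            if math.hypot(x, y) <= detect_radius:
                found = True
                break
        failures += not found
    return failures / trials

for kappa in (0.0, 5.0):
    print(kappa, failure_rate(kappa, start_dist=100.0))
```

The printed rates depend on the assumed detection radius and boundary rule,
so only the qualitative comparison between values of kappa is meaningful.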

A second, and even more important, aspect to investigate is the effect of
the probabilistic scheme employed to combine the exploration and exploitation
phases. To do so, we compare our approach with an alternative strategy where
the choice between the stochastic search and the gradient-based algorithm is
simply based on a deterministic switch governed by the current signal
estimation (see footnote 2). In this framework, particular attention is given
to the effects of the choice of the parameters σ_0 and γ in the activation
function α defined in eq. (12), and to the importance of the term depending on
the angular difference with the previous gradient estimation. To study these
effects we consider a different signal, with a slower decay, and a higher
measurement noise (standard deviation 1). In this way, besides the pure
exploration phase in search of a signal, there is also a larger zone where the
signal is of the same order as the noise, and so where the gradient is more
difficult to estimate correctly. More formally, the signal is modeled as a
Cauchy-like distribution:


    \sigma(r) = a \, \frac{b^2}{x^2 + y^2 + b^2} \qquad (18)


where a and b have been fixed to 10 and 30, respectively, and the source is
again located at (0, 0).
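
To make the scale of this signal concrete, the short sketch below evaluates
eq. (18) with a = 10 and b = 30 and computes the distance at which the
noiseless signal drops to the noise standard deviation. The helper
noise_level_radius is introduced here only for illustration; it follows from
solving a b^2 / (r^2 + b^2) = s for r, which gives r = b sqrt(a/s - 1).

```python
import math

A, B = 10.0, 30.0  # values of a and b stated in the text

def cauchy_signal(x, y, a=A, b=B):
    """Noiseless Cauchy-like signal of eq. (18), source at (0, 0)."""
    return a * b**2 / (x**2 + y**2 + b**2)

def noise_level_radius(noise_std, a=A, b=B):
    """Distance from the source at which the signal equals noise_std."""
    return b * math.sqrt(a / noise_std - 1.0)

print(cauchy_signal(0.0, 0.0))     # 10.0, the peak value at the source
print(noise_level_radius(1.0))     # 90.0 m for a noise standard deviation of 1
print(cauchy_signal(200.0, 0.0))   # about 0.22, well below the noise level
```

With a noise standard deviation of 1, the signal is therefore comparable to
or weaker than the noise beyond roughly 90 m from the source, which includes
the 200 m initial distance used in the trials reported in this section.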

A first analysis is carried out by varying the value of σ_0 and comparing, for
a fixed mission time, the failure rate of the two strategies: i) the standard
GCRW algorithm and ii) the deterministic switch, where the probability α_t
simply becomes a binary variable as follows:


    \alpha_t(\sigma) = \begin{cases} 1 & \text{if } \sigma_t > \sigma_0 \\ 0 & \text{if } \sigma_t < \sigma_0 \end{cases} \qquad (19)
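
The difference between a binary and a smooth activation can be sketched as
follows. Only alpha_switch reflects eq. (19); alpha_logistic is a
hypothetical stand-in chosen for illustration and is not eq. (12) of the
paper, which also includes a term depending on the angular difference between
consecutive gradient estimates. The function choose_mode is likewise an
assumed reading of how the probability selects between the two behaviors.

```python
import math
import random

def alpha_switch(sigma_t, sigma_0):
    """Deterministic switch of eq. (19): exploit only above the threshold.

    The tie case sigma_t == sigma_0 is assigned to exploration here,
    which is an arbitrary choice of this sketch.
    """
    return 1.0 if sigma_t > sigma_0 else 0.0

def alpha_logistic(sigma_t, sigma_0, gamma):
    """Hypothetical smooth activation (logistic curve), NOT eq. (12)."""
    z = gamma * (sigma_t - sigma_0)
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)                      # numerically stable branch for z < 0
    return ez / (1.0 + ez)

def choose_mode(alpha_t, rng):
    """Follow the gradient with probability alpha_t, otherwise keep exploring."""
    return "gradient" if rng.random() < alpha_t else "random_walk"

rng = random.Random(0)
for sigma_t in (2.0, 3.0, 4.0):
    smooth = alpha_logistic(sigma_t, sigma_0=3.0, gamma=3.0)
    print(sigma_t, alpha_switch(sigma_t, 3.0), round(smooth, 3),
          choose_mode(smooth, rng))
```

In this stand-in, a smaller gamma gives a gentler transition around the
threshold, so exploration is still occasionally selected above it and the
gradient is occasionally followed below it.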


For the GCRW algorithm, three instances corresponding to three different
values of γ have been taken into account, namely γ = 3, 5 and 10. The results
are provided in Fig. 7, where 500 trials, with an initial distance of the
formation from the source of 200 m, have been analyzed. A first important
consideration is that the failure rate for the proposed GCRW algorithm is zero
for any value of σ_0 greater than 3, while it is essentially 1 for σ_0 < 2.
This can be interpreted as follows: for values lower than 2, the system trusts
the gradient

2 We would like to remark that direct comparisons with the more standard source-seeking
algorithms presented in Section 2 are not feasible since they rely either on a reliable gradient
estimation or on a flow transporting the signal, neither of them available in the studied
scenario.
Fig. 7 Failure rate for a mission time of 10^3 s as a function of σ_0. In the GCRW algorithm
α_t is defined as in eq. (12), while for the deterministic switch the expression reported in
eq. (19) is used. The results correspond to 500 trials.


estimation even in regions where it is absolutely unreliable (signal too close
to the noise level), misguiding the formation; on the other hand, larger values
provide a safe margin to start isolating the signal from the noise level. It is
also worth noting that the exploration capabilities of the team are sufficient
to eventually locate the source even for very conservative choices of σ_0,
showing the robustness of the algorithm. In many real cases, reliable
knowledge of the signal strength and/or the noise level is indeed not
available. As a result, a suitable choice of this parameter is not easy and a
conservative value may be a safer choice. Secondly, it is already possible to
see a difference between the deterministic switch and the GCRW algorithm.
Especially for lower values of σ_0, the former is significantly less stable: it
places too much trust in inaccurate gradient estimations, leading to higher
failure rates, while the results for different γ values do not differ much
from each other.

The difference between these strategies becomes clearer when considering the
average time required to locate the source. For the same scenario, we restrict
our attention to the values of σ_0 for which failures are negligible, and the
mission times are averaged over the same 500 trials. Here, we can see that the
probabilistic switch employed by the GCRW algorithm, including the trust term
expressing the correlation between consecutive gradient estimations, makes the
convergence to the source faster than a deterministic switch over the entire
range of analyzed values of σ_0 and regardless of the choice of γ. Moreover,
the lowest value reached is provided by the GCRW algorithm with γ = 3, i.e.
corresponding to the smoothest activation function and so the smoothest
transition between the two modes. As a final conclusion, we can note that,
even though an optimal value is clearly present for every different
Fig. 8 Average time necessary to complete the mission as a function of σ_0. The reported
values are averaged over 500 trials.


γ value, the average mission time varies only slowly with σ_0, showing
again a certain robustness with respect to its choice.

Finally, we want to discuss how standard optimization algorithms might
be implemented in this case and how they would perform. The important
constraint that is hard to satisfy is to drive the entire team to the source
and not only a single robot or a subset of them. Without any communication
constraint, this could be obtained with a large spread of the robots over the
environment, which would also allow a larger search capability, and approaches
such as the Particle Swarm Optimization (PSO) algorithm could be suitable for
the problem. However, taking into account strong limitations in the
communication (i.e. a maximum communication range), these solutions are no
longer trivially adaptable.

A different option is represented by global gradient-free optimization
algorithms, such as Simulated Annealing based approaches. However, we consider
here on-line applications where, to test possible candidate states, the robots
need to physically travel to take measurements before deciding whether or not
to accept them. This produces very irregular and expensive paths. To show an
example, we tried to adapt this method to our scenario: keeping the circular
formation, at each iteration the maximum signal strength measured by the
robots is taken into account for the optimization and, when accepted, the
position of the corresponding robot becomes the new formation center. Then the
formation follows a standard simulated annealing algorithm. Fig. 9 shows two
typical solutions obtained by the optimizer in the same scenario presented in
Fig. 1. Clearly, the need to test many possible states to converge to the final
(global) optimum makes the system travel much more (and in a more irregular
way) than with our approach. Including the traveling time, the resulting
average convergence time is in this case several times the one obtained with
the GCRW algorithm.

Fig. 9 Two instances of final path obtained by a simulated annealing approach to converge
on the source.
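
One possible reading of the simulated annealing adaptation described above is
sketched below, mainly to show why physically testing candidate states
inflates the travelled distance. The candidate step length, the temperature
schedule, the even spacing of the robots and the placeholder signal are all
assumptions of this sketch and are not taken from the paper.

```python
import math
import random

def ring(center, radius, n_robots):
    """Robot positions on a circle around `center` (even spacing assumed)."""
    cx, cy = center
    return [(cx + radius * math.cos(2.0 * math.pi * i / n_robots),
             cy + radius * math.sin(2.0 * math.pi * i / n_robots))
            for i in range(n_robots)]

def sa_formation_search(signal, start, noise_std=0.5, radius=5.0, n_robots=6,
                        step=10.0, temp=1.0, cooling=0.995, iters=500, seed=0):
    """Simulated-annealing-style search that also counts the distance travelled."""
    rng = random.Random(seed)
    center = start
    current = max(signal(x, y) + rng.gauss(0.0, noise_std)
                  for x, y in ring(center, radius, n_robots))
    travelled = 0.0
    for _ in range(iters):
        # The formation must physically move to the candidate center to test it.
        angle = rng.uniform(0.0, 2.0 * math.pi)
        cand = (center[0] + step * math.cos(angle),
                center[1] + step * math.sin(angle))
        travelled += step
        readings = [(signal(x, y) + rng.gauss(0.0, noise_std), (x, y))
                    for x, y in ring(cand, radius, n_robots)]
        best_val, best_pos = max(readings)
        # Metropolis rule on the strongest reading in the formation.
        if best_val >= current or rng.random() < math.exp((best_val - current) / temp):
            center, current = best_pos, best_val   # best robot becomes the new center
            travelled += radius
        else:
            travelled += step                      # rejected: travel back
        temp *= cooling
    return center, travelled

def toy_signal(x, y):
    """Hypothetical placeholder field with its maximum at (0, 0)."""
    return 10.0 * 900.0 / (x * x + y * y + 900.0)

final_center, path_length = sa_formation_search(toy_signal, start=(100.0, 0.0))
print(final_center, path_length)
```

Every tested candidate costs at least one step of travel whether or not it is
accepted, which is the source of the long and irregular paths discussed above.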


7 Conclusions 

A new strategy to search for a weak, noisy signal and locate its source using 
a team of mobile robots has been presented in this paper. The robots move in 
a circular formation and exchange information to cooperatively estimate the 
signal gradient based on noisy measurements, while respecting communication 
constraints. The formation center follows a correlated random walk which 
ensures good space filling properties to efficiently search for a zone of reliable 
signal strength. To deal with the exploration vs. exploitation problem, the 
estimated gradient plays the role of a bias in the probability distribution of 
the random walk direction, allowing the team to have a smooth transition 
between the two strategies and to increase the robustness with respect to a 
deterministic switch. Results in simulations showed the effectiveness of this 
algorithm. 

In the future, we intend to better investigate possible variations in the 
algorithm to increase its performance, such as a varying formation radius along 
the mission, which would allow the team to improve either its exploration or 
estimation capability depending on the measured signal. An adaptive selection 
of σ_0 instead of an a priori choice is also a line worth investigating further.
Finally, studying different signal models, relaxing the convexity assumption 
and including the capability of locating multiple sources would be relevant 
extensions for many real applications. 